ABSTRACT: The theme of ‘Managing the digital transformation of the construction industry’ emphasises the 
importance of considering various dimensions of digitalisation and optimising the built environment. This review 
aims to present methodological approaches from existing literature that elucidate location-related factors 
impacting the capital cost of data centres. These findings facilitate adjustments to historical cost data when 
estimating total costs for new data centres. A systematic literature review method was employed to ensure an 
objective and comprehensive synthesis. This review identifies that a Delphi methodology, used in conjunction 
with Bayes's theory, is the most suitable methodological approach for forecasting and modelling capital 
expenditure for hyper-scale data centres. The methodology enables collective decision-making and consensus 
building, recognising the pivotal role of stakeholders in shaping the future of data centres. These findings offer valuable 
insights for researchers and practitioners in forming a methodological approach for further investigations into the 
location-related factors impacting the capital cost of data centres. Embracing this knowledge allows us to align 
research and practice, ensuring that these practices become integral to shaping the future of data centres and the 
digitalisation and optimisation of the built environment. 
1. INTRODUCTION 


The rapid expansion of digital technologies requires buildings, called data centres, to house the information 
technology (IT) equipment that stores and processes the data and services required by digital transformation, 
including the internet. Investors prefer to build data centres in certain regions of the world, such as the Nordic 
regions, because of advantages such as technological progress in the sector and cold climate conditions. 
This presents unprecedented challenges to construction cost consulting professionals in providing reliable capital 
cost estimates as early as a potential (international) location is identified. In the very early stage of a project 
opportunity, cost consultants provide capital expenditure input to support development appraisal exercises which 
estimate the residual land value and input to the Order of Cost estimate involved ‘in determining the possible cost 
of a building(s) in relation to the employer’s fundamental requirements’ (RICS, 2013). 


As these activities occur before preparing a complete set of working drawings (RICS, 2013), capital expenditure 
is estimated by benchmarking cost data from previously completed similar projects. This involves comparing and 
contrasting the difference between historical and proposed projects concerning the cost-significant variables such 
as location, building size, market conditions and their impact on capital expenditure. Existing literature reveals 
generic cost modelling approaches that could be used in early cost estimates and details of cost-significant 
variables that need to be considered during cost modelling (Parameswaran et al., 2019; Hashemi et al., 2020). 


However, as data centres are relatively new to the construction sector and their design and construction 
significantly depend on the location (King et al., 2023), the suitability of the generic cost modelling approaches 
has yet to be widely investigated. Therefore, given the conference theme and the growth of the internet, more 
research is required to establish the impact of site location on the capital expenditure of hyper-scale data centres. 
Such research will support informed location selection and reduce financial risk and contingency allowances, 
leading to more accurate construction cost estimates. This paper aims to present the findings of 
a systematic literature review to determine the theoretical and methodological approaches in existing literature 
concerning the location-related factors affecting the capital cost of data centres that could be used to adjust 
historical cost data during their use in estimating the total cost for new data centre projects. 
SECTION B - ADVANCED PROJECT MANAGEMENT AND CONTROL 


2. MATERIALS AND METHODS 


2.1 Approach 


A systematic approach has been used to identify and synthesise the literature so that the synthesis is accurate and 
unbiased. This approach suits complex topics that have been conceptualised and studied differently among 
researchers (Greenhalgh et al., 2005). This review identifies the methodological approaches, geographies, 
historical development, quality and validity of the literature. 


2.2 Scoping Strategy 


The literature search strategy used a scoping review derived from PRISMA (Tricco et al., 2018) to provide the 
rigour needed to justify further research (McInnes et al., 2018). The search used the advanced search tool with 
Boolean keyword operators. In the stage 1 search (Scopus), 1,375 studies were identified. After an initial review 
of the abstracts, 508 were identified as focused on construction, data centres and cost variables. Of these, 87 were 
duplicates, reducing the number of papers for review to 421. As Suarez-Almazor et al. (2000) suggested, it is vital 
to use a second database to identify potential inconsistencies; a second database may also enhance the literature 
review with newly identified literature. Using the same search criteria as the stage 1 search, a further 1,623 
studies were identified in stage 2 (Google Scholar); after an initial review of the abstracts, 402 were identified 
as focused on construction, data centres and cost variables. Of these, 251 duplicated papers from the stage 1 
search, reducing the number of papers for abstract review to 151 and bringing the total for abstract review to 572. 
Following abstract and full-text review, a total of 161 studies were selected for final review, as shown in Figure 1. 


Stage                      Stage 1 (Scopus)    Stage 2 (Google Scholar)
Initial search             1,375 studies       1,623 studies
After exclusion criteria   508 studies         402 studies
After duplication          421 studies         151 studies
After abstract review      293 studies         71 studies
After full-text review     139 studies         22 studies
Total for final review     161 studies


Fig 1. Systematic approach for literature 
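
The screening arithmetic reported in Section 2.2 and Figure 1 can be checked with a short script. The following is 
a minimal sketch: the counts are those stated in the text, while the dictionary layout and the function names are 
illustrative assumptions and not part of the review protocol.

```python
# Counts as reported in Section 2.2 and Figure 1.
# The data layout and function names are illustrative assumptions.
STAGES = {
    "Scopus": {
        "initial": 1375, "after_exclusion": 508, "duplicates": 87,
        "after_abstract": 293, "after_full_text": 139,
    },
    "Google Scholar": {
        "initial": 1623, "after_exclusion": 402, "duplicates": 251,
        "after_abstract": 71, "after_full_text": 22,
    },
}


def after_duplication(stage):
    """Papers left for abstract review once duplicates are removed."""
    return stage["after_exclusion"] - stage["duplicates"]


def summarise(stages):
    """Return (total for abstract review, total for final review)."""
    abstract_pool = sum(after_duplication(s) for s in stages.values())
    final_pool = sum(s["after_full_text"] for s in stages.values())
    return abstract_pool, final_pool


abstract_pool, final_pool = summarise(STAGES)
print(abstract_pool, final_pool)
```

Running the sketch reproduces the totals stated in the text: 572 papers for abstract review and 161 for final review.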


2.3 Validity and quality of literature 


To assess validity and quality, the papers were classified as peer-reviewed literature or grey literature, as it is 
recognised that including grey literature in systematic reviews provides rigour and a balance of recognised 
sources of information (McAuley et al., 2000; Blackhall, 2007). Whilst grey literature means many things to many 
people (Mahood et al., 2014), this review defines grey literature as book chapters, conference proceedings and 
trade publications. According to McAuley et al. (2000), the review process for a meta-analysis should strive to 
locate and incorporate various reports, including both published and grey literature, that satisfy pre-established 
criteria for inclusion. Of the 161 papers identified for final review, 84% were from peer-reviewed journals, while 
the remaining 16% came from grey literature sources, as shown in Table 1. 


Table 1. Sources of literature 


Source                   Frequency   % of total
Book chapters            9           6%
Conference proceedings   10          6%
Peer-reviewed journals   136         84%
Trade publications       6           4%
Totals                   161         100%


3. RESULTS AND DISCUSSION 


3.1 Methodological Approaches to Cost Modelling 


To understand the methodological approaches used in research, this section provides data on their use in previous 
studies and identifies the approach each study took to synthesising data, which may be useful for future studies. 
Methodological approaches were identified by analysing the abstracts of the literature selected through the 
scoping strategy; this meta-narrative shows that the prediction method has the highest count across all sectors. 
The prediction method has been used significantly in modelling data centre costs. Other important approaches 
include machine learning, heuristic and stochastic methods, parametric modelling, the analytic hierarchy process 
(AHP), regression analysis and Monte Carlo simulation. Some papers referred to machine learning, some to 
artificial neural network techniques and others to neural network techniques. Because these techniques are 
similar and neural networks form a subset of machine learning, we have grouped them in the machine learning 
category. Likewise, several papers referred to closely related techniques whilst others referred to heuristic 
techniques; we have grouped these in the heuristic category for the same reason. 


When the other construction sectors returned by the scoping search are eliminated, the results identify 59 
different approaches related to data centres. Prediction methodology again holds the highest vote count, in line 
with the trend across all sectors. It is acknowledged that prediction theory is not an exact science and ‘can be 
compared to weather forecasting, stock market predictions or betting on how fast a 100-meter foot race will be 
run’ (Line, 2008). Prediction theory also requires a substantial quantity of data. Advanced modelling techniques 
are extensively used in cost modelling to improve accuracy, and Machine Learning-based approaches are among 
the most recent advancements. According to a recent systematic review (Hashemi et al., 2020), artificial neural 
networks (ANN) and Regression Analysis are the most widely used ML-based cost modelling techniques, 
followed by hybrid models such as ANN with fuzzy logic, case-based reasoning (CBR) and genetic algorithms 
(GA). Machine Learning involves developing a machine-based system that can learn from data. A large volume 
of historical data is paramount for a machine-learning model. 


As data centres are relatively new, developing a machine learning-based model is not feasible at this early stage, 
when historical cost data is limited. Fazil et al. (2021) demonstrate that a reasonably accurate neural network 
prediction is possible even when insufficient information is available during the initial design. However, 
Gunaydin and Dogan (2004) argue that the accuracy of a cost estimation neural network model relies strongly 
on the quality and quantity of the data samples used. They claim that more data samples lead to less prediction error. 
Therefore, to create an accurate cost prediction model for building projects, it is necessary to have reliable and 
high-quality cost data for various types and conditions of buildings. Case-based reasoning is another potential 
method for cost prediction, which involves retrieving information from historical data on similar or identical cases. 
However, there are challenges associated with the retrieval process, such as computing similarity measures. 
According to Rashid (2017), case-based reasoning is an effective method for predicting costs because it 
analyses the attributes of past cases, thereby enhancing cost prediction accuracy. However, these models 
mainly rely on historical cost data. In the UK, the Building Cost Information Service (RICS, 2018) offers 
information on construction projects and their corresponding tender prices, and cost managers use this data to 
estimate the cost of a building based on the cost of a similar project with adjustments to reflect any differences. 
However, it does not enable generalisations about the relationships between cost and significant predictors. Lowe 
et al. (2016) developed a dependable regression cost model for estimating the final account cost of a building. 
They highlight that, aside from its practical usefulness, such a model serves two other purposes. First, it provides 
a benchmark for evaluating the effectiveness of neural network models, and second, it helps identify the variables 
that display a significant linear correlation with cost. However, these prediction methods have their limitations. 
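
The benchmarking adjustment described above, in which the cost of a similar completed project is adjusted to 
reflect differences, can be illustrated with a short sketch. It assumes, purely for illustration, that the differences 
are captured by a location index and a price (time) index applied as simple ratios; the function name, the 
parameters and all figures are hypothetical and are not taken from any published dataset.

```python
def adjust_historical_cost(cost, base_location_index, target_location_index,
                           base_price_index, target_price_index):
    """Rebase a historical project cost to a new location and date.

    All index values are hypothetical inputs supplied by the estimator;
    the function only applies the two ratios.
    """
    if min(base_location_index, base_price_index) <= 0:
        raise ValueError("base indices must be positive")
    location_factor = target_location_index / base_location_index
    time_factor = target_price_index / base_price_index
    return cost * location_factor * time_factor


# Illustrative figures only.
estimate = adjust_historical_cost(
    cost=100_000_000,
    base_location_index=100, target_location_index=110,
    base_price_index=200, target_price_index=220,
)
print(round(estimate))
```

With these illustrative inputs both ratios equal 1.1, so the historical cost is uplifted by 21%. Such a ratio 
adjustment rebases a single comparable project; it does not generalise the relationship between cost and its 
predictors.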


Regression techniques require a substantial quantity of statistical data, and their accuracy is affected by the 
assumption that the independent variables are independent of each other and normally distributed (Son et al., 
2012). In contrast, the primary advantage of neural networks over regression models is their capacity to model 
nonlinear relationships without relying on such assumptions (Zhang, 2003). However, building a neural network 
model also requires data, and designing an optimal network structure involves a costly trial-and-error process. 
Therefore, according to Son et al. (2012), there is a notable need for more robust and reliable prediction 
techniques. Likewise, acquiring input data for preparing estimates can be challenging. According to Hashemi et 
al. (2020), where the extent of the work is not well understood, cost estimates can be inaccurate and approximate. 
Few studies focus specifically on selecting suitable sites for data centres (Kheybari et al., 2020); the search 
identified one Delphi study, which was used to select data centre locations across several cities in China. 
However, its main findings identified proximity and geographical location as the only factors having an impact 
(Yang & Ye, 2011). According to King et al. (2023), in the absence of data for assessing the impact of location 
variables for hyperscale data centres, a consensus will need to be obtained from industry experts. 


Whilst Delphi has the lowest vote count, it is an appropriate route to forming a consensus. This literature review 
has identified that voting, in the form of the ameliorated nominal group technique, could be an alternative use of 
Delphi. According to Brauers (2018), the nominal group technique may help generate ideas about objectives that 
could be included in an initial version of the Delphi method, which could facilitate convergence towards a final 
list of objectives. Whilst the top-voted methodological approaches require a substantial amount of data to make 
predictions for capital expenditure, a Delphi study is well suited to establishing consensus on the impact of 
location variables for data centres, where available published data is limited. Some scholars argue that the Delphi 
method lacks a well-established framework (Crisp et al., 1997; 
Sharkey & Sharples, 2001; Broomfield & Humphris, 2001; Turoff & Linstone, 2002; Campbell et al., 2004; Hsu 
& Sandford, 2007). 


However, Delphi could be used solely to identify the location-related variables impacting the capital costs of data 
centres. In addition, the Delphi technique can be integrated with Bayes theory to update established opinions as 
new evidence arrives: expert opinions, collected through a structured sampling technique, provide the 
probabilities of the different outcomes. Bayesian statistics is based on the theory produced by Thomas Bayes 
(1763); it is characterised by a joint treatment of all quantities of interest in a statistical model as random 
variables. In particular, Bayesian statistics naturally incorporates uncertainty analysis, with estimates or forecasts 
described in terms of probability distributions, as shown in Figure 2. 


\[ P(A \mid B) = \frac{P(A) \cdot P(B \mid A)}{P(B)} \]


Figure 2. Bayes theory 


• P(A) denotes the prior belief (for example, the probability of occurrence of the variable, such as the 
probability of encountering ground conditions) 


• P(B|A) denotes the level of impact should that variable occur 


• P(B) denotes the new evidence 


• P(A|B) denotes the updated belief once the new evidence has been taken into account 
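
As an illustration of how the formula in Figure 2 updates a belief, the following sketch applies it to hypothetical 
panel values. The interpretation of A as the occurrence of a location variable and B as the observed evidence, the 
use of the law of total probability to obtain P(B), the function names and all numbers are assumptions made for 
this example and are not taken from the reviewed literature.

```python
def evidence_from_total_probability(prior, likelihood, likelihood_if_not):
    """P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)."""
    return likelihood * prior + likelihood_if_not * (1 - prior)


def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    if not 0 < evidence <= 1:
        raise ValueError("evidence P(B) must be in (0, 1]")
    posterior = prior * likelihood / evidence
    if posterior > 1 + 1e-12:
        raise ValueError("inputs are inconsistent: posterior exceeds 1")
    return posterior


# Hypothetical values, for illustration only.
prior = 0.3              # P(A): a location variable occurs at the site
likelihood = 0.8         # P(B|A): evidence B is observed given A
likelihood_if_not = 0.2  # P(B|not A): evidence B is observed without A

evidence = evidence_from_total_probability(prior, likelihood, likelihood_if_not)
posterior = bayes_posterior(prior, likelihood, evidence)
print(round(evidence, 2), round(posterior, 3))
```

In this example the prior belief of 0.3 rises to roughly 0.63 once the evidence is observed, which is the kind of 
update that successive rounds of expert input could feed.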


The information obtained by the Delphi study can be fed into the Bayes formula to render current outcomes based 
on the updated information as provided by a qualitative assessment of the perceived impact of location variables. 
The combination of Bayes theory and the Delphi method enhances the accuracy and decisiveness of the 
mathematical model when compared directly with Prediction Theory. Whilst most literature identifies the Delphi 
method as a tool for knowledge elicitation, it is the authors' opinion that Delphi is a methodological approach in 
its own right: its systematic nature, potential for quantitative analysis, iterative feedback process, incorporation 
of expert judgment and consideration of uncertainty make it comparable to other methodological approaches. 
This view is also supported by the seminal work of Hasson et al. (2000). For example, while primarily used for 
knowledge elicitation from experts, the Delphi method is a systematic and structured approach to gathering and 
aggregating opinions and judgments. It involves multiple iterations of anonymous surveys or questionnaires to 
collect insights from a panel of experts. While other methods might use probabilistic models, statistical analysis 
or simulation techniques to quantify uncertainty, the Delphi method focuses on expert 
consensus and convergence to address uncertainty. These different approaches to uncertainty management can be 
compared and evaluated based on their effectiveness and suitability for a particular cost modelling context. 
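
The iterative convergence towards consensus described above can be sketched as follows. The consensus rule used 
here (the share of panellists whose rating lies within one point of the median, with a 70% threshold) is an 
assumption chosen for illustration; Delphi studies define consensus in different ways, and the function names and 
ratings are hypothetical.

```python
from statistics import median


def consensus_share(ratings, tolerance=1):
    """Share of panellists whose rating lies within `tolerance` of the median."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    centre = median(ratings)
    close = sum(1 for r in ratings if abs(r - centre) <= tolerance)
    return close / len(ratings)


def first_consensus_round(rounds, threshold=0.7):
    """Return (round_number, median) for the first round reaching consensus.

    `rounds` is a list of rating lists, one per survey round. Returns None
    if no round reaches the threshold.
    """
    for number, ratings in enumerate(rounds, start=1):
        if consensus_share(ratings) >= threshold:
            return number, median(ratings)
    return None


# Hypothetical 1-5 ratings of how strongly one location variable affects cost.
rounds = [
    [1, 2, 5, 5, 3, 1, 4, 5],  # round 1: opinions are spread out
    [2, 3, 4, 5, 3, 2, 4, 4],  # round 2: after anonymous feedback
    [3, 4, 4, 4, 3, 4, 4, 5],  # round 3
]
print(first_consensus_round(rounds))
```

In this hypothetical panel the agreement share rises from 25% to 62.5% and then to 100%, so consensus is reached 
in the third round with a median rating of 4.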


To assess the validity of the findings, we analysed the book chapters, conference proceedings, peer-reviewed 
journals and trade publications relating to the data centre sector and the relationship between the various 
methodological approaches. This indicates that 77% of the findings were from peer-reviewed journals, with 23% 
from grey literature. As a further analysis, we reviewed the country of research to establish whether there were 
research gaps in specific regions or countries; this highlighted that an approach has yet to be identified in the UK. 
Whilst the list of methodological approaches identified is informative, it is essential to highlight our study's 
significant contributions and novel aspects compared to previous research in the broader field of cost modelling. 
Unlike previous studies, our research specifically focuses on the context of data centres, a relatively new domain 
within the construction sector. Data centres present unique challenges due to their dependency on location factors. 
Therefore, our study investigates the impact of site location on capital expenditure, addressing a crucial knowledge 
gap in the literature and aligning itself accordingly with constructing for the future. By exploring this specific 
context, we provide valuable insights that can assist decision-makers in making informed choices, mitigating 
financial risks, and enhancing the accuracy of construction cost estimates for data centres. 


3.2 Location Specific Factors 


We have examined the direction of the relationship between location-specific factors and cost models: do 
location-specific factors influence cost models, or do cost models influence location choices? This relationship is 
a crucial concern in the decision-making process. There are two key influences: 1) the influence of 
location-specific factors on cost models and 2) the influence of cost models on location choices. Firstly, 
location-specific factors directly affect individual cost components. For example, high 
land prices in certain areas may increase site acquisition costs, affecting the overall project budget. Similarly, 
regions with high labour costs may result in higher construction expenses. Additionally, proximity to reliable 
power sources or fibre optic networks can impact energy costs and connectivity expenses. 


Understanding the influence of these location-specific factors on cost models is crucial for accurate budget 
estimation and financial planning during the decision-making process. By incorporating this knowledge into the 
cost models, stakeholders can make informed choices regarding the site location, considering the potential impact 
on capital expenditure. Secondly, cost models can also influence location choices for data centre projects. These 
cost models allow stakeholders to evaluate potential site locations' financial viability and profitability based on 
projected construction costs, operational expenses, and expected returns on investment. Cost models typically 
consider political influences, land and construction costs, energy expenses, maintenance and operational costs, 
taxes, and potential revenue streams (Baloi & Price, 2003). By analysing cost models, stakeholders can compare 
different location options and assess the financial implications associated with each choice. This analysis enables 
them to prioritise locations that align with their budgetary constraints and desired profitability targets. Cost 
models can also provide insights into the cost-effectiveness of various site locations and guide decision-makers 
in selecting the most favourable option. The relationship between location-specific factors and cost models in data centre 
construction is bidirectional. Location-specific factors influence cost models by directly impacting various cost 
components. Simultaneously, cost models play a crucial role in guiding location choices by providing financial 
insights and evaluating the viability and profitability of potential sites. 
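
The two directions of influence can be illustrated with a small sketch in which location-specific inputs drive a 
simple cost model and the model's output is then used to rank candidate sites against a budget. The cost 
components, the class and function names, the site names and all figures are hypothetical simplifications for 
illustration; a practical cost model would include many more variables.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    land_cost: float              # site acquisition cost
    labour_factor: float          # multiplier on the base construction cost
    power_connection_cost: float  # cost of connecting to the power supply


def capital_cost(site, base_construction_cost):
    """Sum the location-driven components of capital expenditure."""
    construction = base_construction_cost * site.labour_factor
    return site.land_cost + construction + site.power_connection_cost


def rank_sites(sites, base_construction_cost, budget):
    """Return (name, cost) pairs within budget, cheapest first."""
    costed = [(s.name, capital_cost(s, base_construction_cost)) for s in sites]
    affordable = [pair for pair in costed if pair[1] <= budget]
    return sorted(affordable, key=lambda pair: pair[1])


# Hypothetical sites and figures (in millions), for illustration only.
sites = [
    Site("Site A", land_cost=20, labour_factor=1.5, power_connection_cost=10),
    Site("Site B", land_cost=5, labour_factor=1.25, power_connection_cost=30),
    Site("Site C", land_cost=40, labour_factor=2.0, power_connection_cost=5),
]
print(rank_sites(sites, base_construction_cost=100, budget=200))
```

Here the location factors determine each site's modelled cost, and the modelled cost in turn excludes Site C and 
ranks Site B ahead of Site A.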


In addition, we have compared the data centre sector to other sectors, demonstrating that other sectors also consider 
location and location-specific factors when cost modelling. For instance, in the retail industry, location plays a 
crucial role in determining the viability and profitability of a store, as researchers have found that factors such as 
population density, income levels, competition, and proximity to transportation hubs significantly influence the 
cost modelling approach for retail establishments (Kerin & Harvey, 1975; Brown, 1993). Similarly, in the real 
estate sector, location-specific factors are vital for estimating property values and rental rates, with research 
suggesting that variables such as neighbourhood quality, accessibility to amenities, proximity to schools, and crime 
rates directly affect the values of residential and commercial properties (Klimczak, 2010). Furthermore, in the 
transportation sector, location-related factors impact cost modelling approaches: when estimating the costs of 
constructing highways or rail networks, factors such as topography, soil conditions, the presence of natural 
obstacles and proximity to existing infrastructure play a significant role (Daniels & Mulley, 2012). These 
examples demonstrate that various sectors, including retail, real estate and transportation, recognise the influence 
of location and location-specific factors when cost modelling. 


4. CONCLUSIONS 


By analysing the methodological approaches through the systematic review, we have established trends in the 
literature and identified what methods are being utilised together. For example, we have identified the Delphi 
method as a structured and iterative approach that involves collecting and synthesising expert opinions to make 
informed decisions. In investigating the impact of site location on capital expenditure, the Delphi method can 
gather insights from a panel of experts on the relationship between location factors and construction costs. 
Drawing on this collective expertise helps to mitigate biases and provides a more comprehensive understanding 
of that relationship. Likewise, Bayes's theory is a statistical approach that incorporates prior knowledge and 
updates probabilities based on new evidence, providing a framework to quantify uncertainty and make 
probabilistic inferences. Applying Bayesian theory to the impact of site location on capital expenditure involves 
formulating and updating probability distributions based on available data and expert opinions. This approach 
allows for a more nuanced and probabilistic assessment that considers the inherent uncertainties in the 
relationship between location factors and construction costs. The Delphi method and Bayesian theory therefore 
provide valuable tools for investigating the impact of site location on capital expenditure for hyperscale data 
centres. 


The Delphi method leverages expert opinions and consensus-building, while Bayesian theory incorporates 
statistical analysis and the integration of prior knowledge and data. Combining these approaches can provide a 
comprehensive understanding of the relationship between site location and construction costs in data centre 
projects. To conclude, this meta-narrative analysis has identified that the synthesis of Delphi methodology and 
Bayes theory is a robust methodological approach to identifying the location-related factors for hyperscale data 
centres where variables are not fully known. The development and growth of data centres, and the results of this 
research, are essential to how we manage the construction industry's digital transformation. 